feat: expose comet metrics through Sparks external monitoring system#3708
feat: expose comet metrics through Sparks external monitoring system#3708coderfender wants to merge 6 commits intoapache:mainfrom
Conversation
aa97dec to
f096d8f
Compare
|
cc : @andygrove |
spark/src/main/scala/org/apache/comet/rules/CometExecRule.scala
Outdated
Show resolved
Hide resolved
f096d8f to
c0eb149
Compare
|
@wForget , Please take a look whenever you get a chance |
c0eb149 to
2eb506f
Compare
spark/src/main/scala/org/apache/comet/rules/CometExecRule.scala
Outdated
Show resolved
Hide resolved
8c60dc8 to
eaa1dc4
Compare
|
@wForget , I addressed code per your review comments. Please take a look whenever you get a chance. Thank you for the kind guidance |
| } | ||
| }) | ||
|
|
||
| def recordStats(stats: CometCoverageStats): Unit = { |
There was a problem hiding this comment.
This isn't exactly ideal. These stats make sense per plan. Here we are incrementing all metrics globally, so all plans that are executed across all sessions in a Spark application will have their counts accumulated and may provide no useful information.
There was a problem hiding this comment.
Sure. But spark does follow a similar approach here but there is always an option to improve granularity
There was a problem hiding this comment.
I wasn't saying the approach is wrong. Just pointing out that for these specific metrics, if the Spark application has multiple plans, the accumulated values are not necessarily meaningful.
|
Thwnk you for the review . Let me take a look and update the logic |
c6da902 to
2b6dcc1
Compare
| } | ||
|
|
||
| object CometDriverPlugin extends Logging { | ||
| def registerCometMetrics(sc: SparkContext): Unit = { |
There was a problem hiding this comment.
Add a configuration to enable this behavior.
|
Thanks @coderfender , overall looks good to me; just need to add a configuration to control this behavior, and keep it disabled by default. |
|
@wForget , thank you for the approval. My preference here is to enable metrics reporting by default unless you think that we would be compromising performance in collecting the metrics ? |
|
Which issue does this PR close?
Closes #3712
Rationale for this change
The following metrics are exposed at the moment :
What changes are included in this PR?
How are these changes tested?